Improved Boosting Performance by Exclusion of Ambiguous Positive Examples
نویسندگان
چکیده
In visual object class recognition it is difficult to densely sample the set of positive examples. Therefore, frequently there will be areas of the feature space that are sparsely populated, in which uncommon examples are hard to disambiguate from surrounding negatives without overfitting. Boosting in particular struggles to learn optimal decision boundaries in the presence of such hard and ambiguous examples. We propose a twopass dataset pruning method for identifying ambiguous examples and subjecting them to an exclusion function, in order to obtain more optimal decision boundaries for existing boosting algorithms. We also provide an experimental comparison of different boosting algorithms on the VOC2007 dataset, training them with and without our proposed extension. Using our exclusion extension improves the performance of all the tested boosting algorithms except TangentBoost, without adding any additional test-time cost. In our experiments LogitBoost performs best overall and is also significantly improved by our extension. Our results also suggest that outlier exclusion is complementary to positive jittering and hard negative mining.
منابع مشابه
Improved Boosting Performance by Explicit Handling of Ambiguous Positive Examples
Visual classes naturally have ambiguous examples, that are different depending on feature and classifier and are hard to disambiguate from surrounding negatives without overfitting. Boosting in particular tends to overfit to such hard and ambiguous examples, due to its flexibility and typically aggressive loss functions. We propose a two-pass learning method for identifying ambiguous examples a...
متن کاملBoosTexter : A Boosting - based System for Text Categorization
This work focuses on algorithms which learn from examples to perform multiclass text and speech categorization tasks. Our approach is based on a new and improved family of boosting algorithms. We describe in detail an implementation, called BoosTexter, of the new boosting algorithms for text categorization tasks. We present results comparing the performance of BoosTexter and a number of other t...
متن کاملReal-time Face Detection Using Boosting Learning in Hierarchical Feature Spaces
Boosting-based methods have recently led to the state-of-the-art face detection systems. In these systems, weak classifiers to be boosted are based on simple, local, Haar-like features. However, it can be empirically observed that in later stages of the boosting process, the non-face examples collected by bootstrapping become very similar to the face examples, and the classification error of Ha...
متن کاملSimilarity-Based Approach for Positive and Unlabeled Learning
Positive and unlabelled learning (PU learning) has been investigated to deal with the situation where only the positive examples and the unlabelled examples are available. Most of the previous works focus on identifying some negative examples from the unlabelled data, so that the supervised learning methods can be applied to build a classifier. However, for the remaining unlabelled data, which ...
متن کاملBoosting Applied toe Word Sense Disambiguation
In this paper Schapire and Singer’s AdaBoost.MH boosting algorithm is applied to the Word Sense Disambiguation (WSD) problem. Initial experiments on a set of 15 selected polysemous words show that the boosting approach surpasses Naive Bayes and Exemplar–based approaches, which represent state–of–the–art accuracy on supervised WSD. In order to make boosting practical for a real learning domain o...
متن کامل